Abstract. The strong dependence of the large-scale dark matter halo bias on 
the (local) non-Gaussianity parameter, $f_{\rm NL}$, offers a promising avenue towards 
constraining primordial non-Gaussianity with large-scale structure surveys. In 
this paper, we present the first detection of the dependence of the non-Gaussian 
halo bias on halo formation history using N-body simulations. We also present an 
analytic derivation of the expected signal based on the extended Press-Schechter 
formalism. In excellent agreement with our analytic prediction, we find that 
the halo formation history-dependent contribution to the non-Gaussian halo bias 
(which we call non-Gaussian halo assembly bias) can be factorized in a form 
approximately independent of redshift and halo mass. The correction to the 
non-Gaussian halo bias due to the halo formation history can be as large as 100%, 
with a suppression of the signal for recently formed halos and enhancement for 
old halos. This could in principle be a problem for realistic galaxy surveys if 
observational selection effects were to pick galaxies occupying only recently formed 
halos. Current semi-analytic galaxy formation models, for example, imply an 
enhancement in the expected signal of ~ 23% and ~ 48% for galaxies at z = 1 
selected by stellar mass and star formation rate, respectively. 



1. Introduction 

Placing constraints on deviations from Gaussian primordial fluctuations offers the 
possibility to test inflationary models [1, 2] and probes aspects of inflation (namely 
the interactions of the inflaton) that are difficult to probe by other means. 

In this paper we focus on the so-called local non-Gaussianity, which describes 
inflation-motivated departures from Gaussian initial conditions and is parameterized 
by [3, 4, 5, 6]: 

$\Phi = \phi + f_{\rm NL}\left(\phi^2 - \langle\phi^2\rangle\right)$. (1) 

Here $\phi$ denotes a Gaussian field and $\Phi$ denotes Bardeen's gauge-invariant 
potential, which on sub-Hubble scales reduces to the usual Newtonian peculiar 
gravitational potential, up to a minus sign. The parameter $f_{\rm NL}$ is the amplitude of 
the non-Gaussian correction; since $\phi \sim 10^{-5}$ and current observational limits restrict 
$|f_{\rm NL}| < 100$ [7, 8], we are considering corrections of order $10^{-3}$. 
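The size of the correction in Equation (1) can be illustrated with a short numerical sketch. The field below is a list of independent Gaussian draws with rms $10^{-5}$, an assumption made only for illustration (it is not a realization of a primordial potential), and the function name `local_ng_transform` is hypothetical.

```python
import random

def local_ng_transform(phi, f_nl):
    """Return Phi = phi + f_NL * (phi^2 - <phi^2>) for a list of field samples."""
    mean_sq = sum(p * p for p in phi) / len(phi)  # sample estimate of <phi^2>
    return [p + f_nl * (p * p - mean_sq) for p in phi]

# Illustrative Gaussian samples with rms 1e-5 (assumed amplitude).
rng = random.Random(0)
phi = [rng.gauss(0.0, 1.0e-5) for _ in range(10000)]
Phi = local_ng_transform(phi, f_nl=100.0)

# Largest non-Gaussian shift, in units of the rms of phi.
max_fractional_shift = max(abs(P - p) for P, p in zip(Phi, phi)) / 1.0e-5
print(max_fractional_shift)
```

Subtracting the sample mean of $\phi^2$ keeps the mean of the field unchanged, and for $f_{\rm NL} = 100$ the shifts stay at the percent level or below relative to the rms of $\phi$.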

Recently, Refs. [9, 10] showed that primordial non-Gaussianity affects the 
clustering of dark matter halos (i.e., density extrema), inducing a scale-dependent 
bias for halos on large scales. The strong scale-dependence of halo bias ($\propto 1/k^2$) 
predicted for non-Gaussianity of the local type [9] can provide constraints on $f_{\rm NL}$ 
competitive with those available from the Cosmic Microwave Background O [HI [12] . 
Analytic estimates of the amplitude of the scale-dependent bias show good agreement 
with results from N-body simulations [9, 13, 14, 15]. 

Slosar et al. (2008) [7] argue that the amplitude of the non-Gaussian halo bias 
should depend on the halo merger history. Motivated by the idea that quasar 
activity is triggered by recent mergers, they estimate the amplitude of the 
non-Gaussian halo bias for recent mergers. In this paper we extend their reasoning to 
a more general dependence on the halo merger history through the halo formation 
redshift $z_f$. We compare this analysis with the dependence of the non-Gaussian 
halo bias on halo merger history detected in the N-body simulations of Grossi et 
al. (2009) [13]. By comparison with the halo merger history dependence of the halo 
occupation distribution of certain galaxies in semi-analytic models of galaxy formation, 
we estimate the possible impact of these results on predictions for the amplitude of 
the non-Gaussian bias in upcoming large scale structure surveys. 

This paper is organized as follows. In Section 2 we first revisit the extended 
Press-Schechter non-Gaussian halo merger bias derivation of Ref. [7] and then generalize 
it to arbitrary halo formation redshifts. In Section 3 we detect the effect in N-body 
simulations and show the agreement with the analytic description, including 
the simple halo mass and redshift dependence predicted by the model. We explore 
the consequences for practical determination of $f_{\rm NL}$ in Section 4 and we conclude in 
Section 5. The appendix presents our methodology for fitting our simulation results for 
the amplitude of the non-Gaussian halo bias mode by mode, i.e. without computing 
a binned power spectrum. 

2. Theory 

It has been shown [16] that for non-Gaussianity of the local type considered here, the 
bispectrum is dominated by the so-called squeezed configurations, triangles where one 
wavevector length is much smaller than the other two. In other words, local 
non-Gaussianity introduces strong coupling between large and small scales. It is this 
coupling that alters halo clustering on large scales (and the halo mass function). 
In the peak-background split framework, for Gaussian initial conditions, the 
short-wavelength modes of the density field are responsible for halo collapse and virialization, 
while the long-wavelength ones modulate halo counts. The Lagrangian bias of halos of 
mass $M$ at redshift $z_o$ relates their number density (in Lagrangian coordinates) to the 
long-wavelength matter overdensity field $\delta_l(\mathbf{x})$ at redshift $z_o$: 

$n_h(M, z_o, \mathbf{x}) = \bar{n}(M, z_o)\left[1 + b_L(M, z_o)\,\delta_l(\mathbf{x}, z_o)\right]$. (2) 

Upon rearranging this equation, the large-scale Lagrangian halo bias for halos of 
mass $M$ arising from Gaussian initial conditions is related to the halo number density 
as 

$b_L^G = \frac{1}{\bar{n}}\frac{\partial n}{\partial \delta_l} = -\frac{1}{\bar{n}}\frac{\partial n}{\partial \delta_c}$, (3) 

because in the Gaussian case the effect of modulating the density field by a long 
wavelength mode $\delta_l$ in some volume of the Universe can be re-expressed simply as an 
additive change in the effective critical density for collapse $\delta_c$ in that region. In the 
presence of mode coupling due to primordial non-Gaussianity, large-scale modes affect 
the statistical properties of small-scale modes (and vice versa). For special types of 
non-Gaussianity (i.e., the local case) it is possible to generalize the Gaussian 
peak-background split derivation of halo clustering to non-Gaussian initial conditions. This 
was done in Slosar et al. (2008) [7], which we now summarize. 

2.1. Review of Slosar et al. (2008) theoretical results 

Slosar et al. (2008) [7] re-cast previous results on non-Gaussian halo bias [9, 10, 17] 
by instead using a peak-background split of the Gaussian field $\phi$ in Equation (1), where 
long and short wavelength modes are independent: 

$\phi = \phi_l + \phi_s$. (4) 

The long wavelength density and potential fluctuations are related by the Poisson 
equation, which can be expressed in Fourier space using $\delta_l(k, z_o) = D(z_o) M(k) \Phi(k)$ 
with 

$M(k) = \frac{2}{3}\frac{k^2 T(k)}{\Omega_m H_0^2}$, (5) 

where $T(k)$ is the transfer function and $D(z) = g(z)(1 + z)^{-1}$ is the linear growth 
function normalized to $(1 + z)^{-1}$ in the matter-dominated epoch. That is, $g(z)$ is the 
growth suppression due to non-zero $\Lambda$, for which $g(z_{\rm CMB}) \simeq 1$ and $g(z = 0) \simeq 0.75$ 
in a concordance cosmology. We note here that while in this paper $\Phi(k)$ refers 
to the potential in the matter-dominated epoch, other authors (e.g., Ref. [13]) 
have chosen to work with $\Phi(k)$ normalized at $z = 0$, the "LSS convention." The 
gravitational potential depends on redshift in a non-Einstein-de Sitter universe: 
$\Phi^{\rm LSS}(k) = \Phi(k)\, g(z = 0)/g(z_{\rm CMB})$. We can see from Equation (1) that $f_{\rm NL}$ also depends 
on this choice, so that $f_{\rm NL}^{\rm LSS} = f_{\rm NL}^{\rm CMB}\, g(z_{\rm CMB})/g(z = 0) \approx 1.3\, f_{\rm NL}^{\rm CMB}$. Throughout 
this section, $f_{\rm NL}$ refers to $f_{\rm NL}^{\rm CMB}$. 

The effect of the non-Gaussianity described by Equation (1) (and its induced mode 
coupling) is to modulate the amplitude of small-scale density fluctuations $\delta_s$ with the 
long wavelength potential fluctuations. This can be viewed as a change in the local 
value of $\sigma_8$, $\sigma_8^{\rm local}$, due to $\phi_l$: 

$\delta_s(z_o) = D(z_o) M(k)\left[(1 + 2 f_{\rm NL}\phi_l)\,\phi_s + f_{\rm NL}\phi_s^2\right]$. (6) 

In this picture, the Lagrangian halo bias becomes 

$b_L(M, k, z_o) = b_L^G(M, z_o) + 2 f_{\rm NL}\frac{d\phi_l}{d\delta_l}\frac{\partial \ln n}{\partial \ln \sigma_8^{\rm local}}$. (7) 

Since $d\phi_l/d\delta_l = [D(z_o) M(k)]^{-1}$, the final expression for the non-Gaussian 
scale-dependent component of the halo bias on large scales is 

$\Delta b_{\rm NG}(M, k, z_o) = \frac{2 f_{\rm NL}}{D(z_o) M(k)}\frac{d \ln n(M, z_o)}{d \ln \sigma_8}$, (8) 

where we have dropped the "local" label. That is, under the assumptions of [7], the 
non-Gaussian scale-dependent halo bias can be predicted from determination of the 
dependence on $\sigma_8$ of the mass function of the desired objects in cosmologies with 
Gaussian initial conditions. In particular, to determine $\Delta b_{\rm NG}$ for halos of mass $M_0$ 
that have undergone a recent merger, they write 

$\frac{\partial \ln n_{\rm merger}(M_0, z_o)}{\partial \ln \sigma_8} = \frac{\partial \ln n(M_0, z_o)}{\partial \ln \sigma_8} + \frac{\partial \ln P(M_1|M_0, z_o)}{\partial \ln \sigma_8}$, (9) 

where $M_1$ is the progenitor mass for a halo of mass $M_0$ that has undergone a recent 
merger, and $P(M_1|M_0, z_o)$ is the probability that a halo of mass $M_0$ at $z_o$ has a recent 
progenitor of mass $M_1$. For a universal mass function, the first term evaluates to 
$\delta_c b_L^G$ (or $q\,\delta_c b_L^G$ [13]). Using the extended Press-Schechter (ePS) formalism, the second 
term evaluates to $-1$ (independent of $M_1$), in good agreement with the dependence of 
merger rates on $\sigma_8$ found in N-body simulations during the matter-dominated epoch 
[7]. Ref. [H] finds that in the ePS formalism, the Gaussian halo bias is independent 
of its formation history. While "halo assembly bias" is the subject of a lot of current 
theoretical effort (e.g.,[T5j[^ and references therein) it appears to be substantial only 
at the lowest halo masses. N-body simulations show that the dependence of halo bias 
on secondary parameters is relatively small for the massive halos on very large scales 
of interest in this work f^H [21 [H [Ml HOI Hi] . In particular, Ref. [M] find that for 
$M > 10 M_*$, the 20% youngest halos have a 10% larger Gaussian halo bias than the 
20% oldest halos. Here we concentrate instead on the formation history dependence of 
the non-Gaussian correction to the halo bias, which, as we show below, is much larger. 
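The $1/k^2$ scaling of Equation (8) can be made concrete with a small sketch. It assumes the standard Poisson-equation factor $M(k) = \frac{2}{3} k^2 T(k)/(\Omega_m H_0^2)$ with $T(k) = 1$ on large scales, wavenumbers in $h$/Mpc, and illustrative values $\Omega_m = 0.26$ and $D(z_o = 0) = 0.75$; the function names are hypothetical and none of these numerical choices is taken from the simulations discussed below.

```python
H0_OVER_C = 1.0 / 2997.92458  # H0/c in h/Mpc for H0 = 100 h km/s/Mpc

def poisson_factor(k, omega_m=0.26, transfer=1.0):
    """M(k) = (2/3) k^2 T(k) / (Omega_m H0^2), with k in h/Mpc and c = 1."""
    return (2.0 / 3.0) * k * k * transfer / (omega_m * H0_OVER_C ** 2)

def delta_b_ng(k, f_nl, a_ng, growth=0.75, omega_m=0.26, transfer=1.0):
    """Scale-dependent bias 2 f_NL A / (D M(k)), where A = dln n / dln sigma_8."""
    return 2.0 * f_nl * a_ng / (growth * poisson_factor(k, omega_m, transfer))

# Illustrative amplitude A = 1.12 at two large-scale wavenumbers.
for k in (0.005, 0.01, 0.02):
    print(k, delta_b_ng(k, f_nl=100.0, a_ng=1.12))
```

Under these assumptions the correction is of order 0.1 at $k = 0.01\,h$/Mpc for $f_{\rm NL} = 100$ and grows by a factor of four for every halving of $k$.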



2.2. General dependence of $\Delta b_{\rm NG}$ on $z_f$ using ePS 

In the formalism of ePS, we can easily generalize the results of [7] to express the 
dependence of $\Delta b_{\rm NG}$ on the halo formation history, which we define by the "formation 
redshift" $z_f$. In the original formulation of ePS [26], $z_f$ is the redshift at which the halo 
contains half of its current mass; on the other hand Ref. [37] suggests that, at least for 
the observational properties of clusters, this should rather be defined as $M_f = f M_0$, 
with $f \approx 0.75$. In this section we leave $f$ as a free parameter. We can therefore write 
$\Delta b_{\rm NG}$ explicitly in terms of the halo mass $M$ observed at redshift $z_o$ with formation 
redshift $z_f$, and the fraction $f$ of the halo mass assembled at $z_f$: 

$\Delta b_{\rm NG}(M, k, z_o, z_f, f) = \frac{2 f_{\rm NL}}{D(z_o) M(k)}\left(\frac{d \ln n(M, z_o)}{d \ln \sigma_8} + \frac{\partial \ln P_{z_f}(fM, z_f|M, z_o)}{\partial \ln \sigma_8}\right)$. (10) 

Here $P_{z_f}$ is the conditional mass function: the probability that a halo with mass 
$M$ at $z_o$ has a mass $fM$ at an earlier redshift between $z_f$ and $z_f + dz_f$. With the 
same approximations as in Sec. 2.5.2 of [26], but generalizing to arbitrary fraction $f$ 
defining the epoch of formation, an analytic expression for the probability distribution 
of formation redshifts $P_{z_f}(fM, z_f|M, z_o)$ can be derived as follows. We start by 
defining the amplitude of fluctuations in the linear density field evolved to $z = 0$, 
as usual: 

$\sigma^2(M) = \frac{1}{2\pi^2}\int dk\, k^2 P(k)\, W_M^2(k) = \sigma_8^2\, \frac{\int dk\, k^2 P(k)\, W_M^2(k)}{\int dk\, k^2 P(k)\, W_8^2(k)}$, (11) 

where $W_M$ is the Fourier transform of a top-hat filter with radius $R = (3M/4\pi\rho_m)^{1/3}$ 
($W_8$ is the same filter with $R = 8\,h^{-1}$ Mpc), 
$\rho_m$ denotes the matter background density at $z = 0$, and $P(k)$ is the linear matter 
power spectrum at $z = 0$. The second equality makes explicit how we define the $\sigma_8$ in 
Equation (10) that we differentiate with respect to. That is, $\sigma_8$ defines the amplitude of 
$\sigma^2(M)$, while the mass dependence is held fixed. Following Ref. [26] we also introduce 
the quantity 

$\omega_f = \frac{\delta_c(z_f) - \delta_c(z_o)}{\sqrt{\sigma^2(fM) - \sigma^2(M)}}$. (12) 
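At fixed final halo mass $M$ at $z_o$, the probability that the halo formed between $z_f$ and 
$z_f + dz_f$ is simply expressed in terms of $\omega_f$: 

$P_{z_f}\,dz_f = P_{\omega_f}\frac{d\omega_f}{dz_f}\,dz_f$, 

$P_{\omega_f} = \left(2 - \frac{1}{f}\right)\sqrt{\frac{2}{\pi}}\,e^{-\omega_f^2/2} + 2\omega_f\left(\frac{1}{f} - 1\right){\rm Erfc}\left(\frac{\omega_f}{\sqrt{2}}\right)$, (13) 

where $\delta_c(z) = \Delta_c(z) D(z = 0)/D(z)$ (with $\Delta_c(z) \simeq 1.686$) is the critical overdensity 
for collapse at redshift $z$, extrapolated to $z = 0$, and Erfc denotes the complementary error function. The 
halo mass, redshift, and $\sigma_8$ dependencies of $P_{z_f}$ are absorbed into the variable $\omega_f$. To 
compute the second term in Equation (10) we must differentiate the formation redshift 
probability distribution $P_{z_f}$ at fixed halo mass $M$ with respect to $\sigma_8$. Because both 
$\omega_f$ and $d\omega_f/dz_f$ scale as $1/\sigma_8$, 

$\frac{\partial \ln P_{z_f}(fM, z_f|M, z_o)}{\partial \ln \sigma_8} = -1 - \frac{\omega_f}{P_{\omega_f}}\frac{dP_{\omega_f}}{d\omega_f}$. (14) 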
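The behaviour of Equations (13) and (14) can be checked numerically. The sketch below assumes the normalized form $P_{\omega_f} = (2 - 1/f)\sqrt{2/\pi}\,e^{-\omega_f^2/2} + 2\omega_f(1/f - 1)\,{\rm Erfc}(\omega_f/\sqrt{2})$ for the formation-time distribution and evaluates the logarithmic derivative with a central finite difference; the function names and the step size are illustrative choices, not part of the original analysis.

```python
import math

def p_omega(w, f=0.5):
    """ePS formation-time distribution in omega_f for mass fraction f (assumed form)."""
    gauss = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * w * w)
    return (2.0 - 1.0 / f) * gauss + 2.0 * w * (1.0 / f - 1.0) * math.erfc(w / math.sqrt(2.0))

def dlnp_dlnsigma8(w, f=0.5, eps=1.0e-5):
    """-1 - (omega_f / P) dP/domega_f, using a central finite difference."""
    slope = (p_omega(w + eps, f) - p_omega(w - eps, f)) / (2.0 * eps)
    return -1.0 - w * slope / p_omega(w, f)

def trapezoid(func, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# The distribution should integrate to one for any f between 0.5 and 1.
norm_half = trapezoid(lambda w: p_omega(w, 0.5), 0.0, 12.0)
norm_three_quarters = trapezoid(lambda w: p_omega(w, 0.75), 0.0, 12.0)
print(norm_half, norm_three_quarters, dlnp_dlnsigma8(8.0))
```

For large $\omega_f$ the derivative tends to $\omega_f^2 - 1$, while for $f = 0.5$ it tends to $-2$ as $\omega_f \to 0$, so recently formed halos receive a negative contribution and old halos a positive one.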

Therefore, the ePS formalism predicts that the amplitude of the merger history 
dependent contribution to $\Delta b_{\rm NG}$ depends on $M$, $z_o$, and $z_f$ only through a single 
variable, $\omega_f$. Note that Equation (14) approaches $\omega_f^2 - 1$ in the limit of large $\omega_f$. In 
Section 3 we test Equation (14) explicitly by dividing our simulated halo sample into 
bins in $\omega_f$. Correspondingly, we must average Equation (14) over the same bins. To 
reduce the impact of known discrepancies between the ePS prediction for $P_{z_f}$ and 
those measured in N-body simulations (e.g., [28]), we express the ePS predictions in 
terms of the fraction $x$ of halos with the lowest or highest values of $\omega_f$. For the lowest, 
we set $\omega_1 = 0$ and solve for $\omega_2$ such that 

$\int_{\omega_1}^{\omega_2} d\omega_f\, P_{\omega_f} = x$. (15) 

For the highest values of $\omega_f$, we set $\omega_2 = \infty$ and we solve similarly for $\omega_1$. The final 
expression for the mean value of Equation (14) in the range $[\omega_1(x), \omega_2(x)]$ as a function 
of halo fraction $x$ is 

$\left\langle\frac{\partial \ln P_{z_f}(fM, z_f|M, z_o)}{\partial \ln \sigma_8}\right\rangle = \frac{\int_{\omega_1(x)}^{\omega_2(x)} d\omega_f\, P_{\omega_f}\left(-1 - \frac{\omega_f}{P_{\omega_f}}\frac{dP_{\omega_f}}{d\omega_f}\right)}{\int_{\omega_1(x)}^{\omega_2(x)} d\omega_f\, P_{\omega_f}} = \frac{\left[-\omega_f P_{\omega_f}\right]_{\omega_1(x)}^{\omega_2(x)}}{\int_{\omega_1(x)}^{\omega_2(x)} d\omega_f\, P_{\omega_f}}$. (16) 
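The bin-averaged prediction can be evaluated with a few lines of code. The sketch below assumes the same normalized form of $P_{\omega_f}$ as in Equation (13), finds the bin edge by bisection on a numerically integrated cumulative distribution, and uses the boundary-term form $-[\omega_f P_{\omega_f}]/x$ of the average; all function names are hypothetical.

```python
import math

def p_omega(w, f=0.5):
    """ePS formation-time distribution in omega_f for mass fraction f (assumed form)."""
    gauss = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * w * w)
    return (2.0 - 1.0 / f) * gauss + 2.0 * w * (1.0 / f - 1.0) * math.erfc(w / math.sqrt(2.0))

def cdf_omega(w, f=0.5, n=2000):
    """Fraction of halos with omega_f below w (trapezoid rule)."""
    if w <= 0.0:
        return 0.0
    h = w / n
    total = 0.5 * (p_omega(0.0, f) + p_omega(w, f))
    for i in range(1, n):
        total += p_omega(i * h, f)
    return total * h

def omega_at_fraction(x, f=0.5, hi=12.0):
    """Solve cdf_omega(w) = x for w by bisection on [0, hi]."""
    lo = 0.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf_omega(mid, f) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_dlnp_dlnsigma8(x, which, f=0.5):
    """Average of -1 - (w/P) dP/dw over the lowest or highest fraction x in omega_f."""
    if which == "low":   # omega_1 = 0, so only the upper boundary term survives
        w = omega_at_fraction(x, f)
        return -w * p_omega(w, f) / x
    if which == "high":  # omega_2 -> infinity, where omega * P vanishes
        w = omega_at_fraction(1.0 - x, f)
        return w * p_omega(w, f) / x
    raise ValueError("which must be 'low' or 'high'")

low_half = mean_dlnp_dlnsigma8(0.5, "low")
high_half = mean_dlnp_dlnsigma8(0.5, "high")
print(low_half, high_half)
```

Because the average reduces to a boundary term, complementary subsamples weighted by their halo fractions cancel exactly; for $f = 0.5$ and $x = 0.5$ this sketch gives values close to $\mp 1.25$.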



One may worry that the expression for the conditional probability (Equation (13)) 
was derived within the Press-Schechter [29] framework: an approximation yielding an 
expression for the halo mass function which does not reproduce N-body simulation 
results well, especially at small and large masses, and which has been significantly improved 
(e.g., [30, 31]). Unfortunately there is no analytical expression for the conditional 
probability in the context of these improvements. However, van den Bosch et al. [28] 
found relatively good agreement between Equation (13) and their N-body simulation 
results, though for the massive halos of interest in this work, the simulated halos 
formed somewhat earlier than Equation (13) predicts. Moreover, here we are only 
interested in how $dP/dz_f$ changes as $\sigma_8$ is varied; we therefore expect ePS to fare 
better in this respect. We will show in Section 3 that the ePS approach adopted here is 
a remarkably good description of the dependence of $dP/dz_f$ on $\sigma_8$ as measured from 
N-body simulations. 

Figure 1. Each point in the figure represents a halo in our mass-limited 
$M > 2 \times 10^{13}\,h^{-1} M_\odot$ sample at $z = 0$. Each color band represents a subsample 
of the full distribution that reproduces the mass function of the full sample but 
has a different formation redshift distribution; each subsample contains 10% of the 
total halos in the sample. The black lines dividing two colors define our cumulative 
samples. For instance, all halos below the third black line from the bottom of the 
plot enter our $x = 0.3$ "low $z_f$" sample, while all halos above the third black line 
from the top enter our $x = 0.3$ "high $z_f$" sample. 

3. Simulation Results 

We use the Grossi et al. (2009) [13] set of 5 simulations ($f_{\rm NL}^{\rm LSS} = 0, \pm 100, \pm 200$, 
or equivalently, $f_{\rm NL}^{\rm CMB} = 0, \pm 75, \pm 151$), where all simulations use the same initial 
condition field $\phi$ in Equation (1) to suppress cosmic variance. These simulations have 
$L_{\rm box} = 1200\,h^{-1}$ Mpc and particle mass $m_p = 1.4 \times 10^{11}\,h^{-1} M_\odot$. We generate halo 
merger trees at $z = 0$ with the SUBFIND code [32], based on simulation outputs at 
$z$ = (0, 0.137, 0.283, 0.441, 0.613, 0.804, 1.017, 1.258, 1.535, 1.857, 2.236, 2.688, 3.235, 
3.907, 4.749, 5.822). To define a halo's formation redshift $z_f$, we interpolate between 
the progenitor masses at the available redshifts. 
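The interpolation step can be sketched as follows. The text does not specify the interpolation variable, so this example assumes linear interpolation of the main progenitor mass in redshift between adjacent snapshots; the function name and the toy mass history are hypothetical.

```python
def formation_redshift(redshifts, masses, f=0.5):
    """Redshift at which the main progenitor mass first drops to f * M(z_o).

    redshifts: snapshot redshifts in increasing order, starting at z_o.
    masses: main progenitor mass at each snapshot (same length).
    Returns None if the progenitor never falls to f * M(z_o) in the outputs.
    """
    target = f * masses[0]
    for i in range(1, len(redshifts)):
        m_lo, m_hi = masses[i - 1], masses[i]
        if m_lo >= target > m_hi:  # the threshold is crossed in this interval
            t = (m_lo - target) / (m_lo - m_hi)
            return redshifts[i - 1] + t * (redshifts[i] - redshifts[i - 1])
    return None

# Toy mass history on the first three snapshot redshifts quoted above.
z_f = formation_redshift([0.0, 0.137, 0.283], [10.0, 8.0, 4.0], f=0.5)
print(z_f)
```

Other choices, such as interpolating in $\log M$ or in cosmic time, would shift $z_f$ slightly but would not change the ranking of halos within a narrow mass bin very much.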
Figure 2. Halo-halo power spectra from the $f_{\rm NL}^{\rm LSS} = -200$ (dashed) and 
$f_{\rm NL}^{\rm LSS} = 200$ (solid) simulations for halos with $M > 2 \times 10^{13}\,h^{-1} M_\odot$ for 25% 
subsamples of halos with the highest (green) and lowest (blue) $z_f$ values. No shot 
noise correction has been applied, and so for comparison we produce a random 
subsample of the full mass-limited sample with the same number density as the 
other samples (red). The left panel uses halos at $z = 0$, while the right panel uses 
halos at $z = 0.8$. 



The first term in Equation (10) is proportional to the Gaussian Lagrangian halo 
bias $b_L^G$, which depends on halo mass. To separate the dependence of $\Delta b_{\rm NG}$ on 
halo formation redshift from its dependence on halo mass, we create $z_f$-dependent 
subsamples of the full mass-limited halo sample at fixed observed redshift $z_o$ that 
match the halo mass function of the full sample. To do this, we sort the $N_{\rm halos}$ halos 
by mass, and form groups of $N_{\rm group} \ll N_{\rm halos}$ halos closest in mass. We sort these 
$N_{\rm group}$ halos by $z_f$, or equivalently by $\omega_f$ (since the halo mass is nearly constant in 
the $N_{\rm group}$ halo subsample). The highest and lowest fraction $x$ of these $N_{\rm group}$ halos 
enter the $z_f$-dependent subsamples that by design have matching mass functions. 
The full $M > 2 \times 10^{13}\,h^{-1} M_\odot$ sample has $N_{\rm halos} \sim 250{,}000$ at $z = 0$, while we 
use $N_{\rm group} = 10{,}000$ throughout the main text; in the Appendix we demonstrate 
that our results are unchanged if $N_{\rm group} = 100$ is used instead. Figure 1 illustrates 
the sample selection scheme more clearly, where we plot each halo in our $z = 0$, 
$M > 2 \times 10^{13}\,h^{-1} M_\odot$ sample in the two-dimensional space $z_f$-$M$, with the formation 
redshift in this case defined by $M(z_f) = f M_0$ for $f = 0.5$. Note the relatively mild trend that 
the average $z_f$ decreases with $M$, and also the large spread in $z_f$ at fixed $M$. At each 
small halo mass bin defined by $N_{\rm group}$ halos, we divide the sample into 10 bins based 
on $z_f$, and the $z_f$ bins from different mass bins are combined to produce halo samples 
with matching mass functions but different formation redshift distributions. The 
result is 10 distinct samples represented by the color bands in Figure 1. To increase 
the signal to noise ratio of our measurements, we do not present non-Gaussian halo 
bias measurements for the 10 disjoint samples represented in Figure 1, but instead 
present results for cumulative samples defined by the 9 lines which each divide two 
colors in the figure. That is, samples labelled as $x = 0.3$ contain either the three lowest 
or three highest color bands in the figure; the black lines divide the different samples. 
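The construction of mass-matched subsamples described above can be sketched in a few lines. The mock catalogue below (uniform $\log_{10} M$ with a mild trend and large scatter in $z_f$) is invented purely for illustration, and `zf_decile_labels` is a hypothetical helper, not the code used for the measurements.

```python
import random

def zf_decile_labels(masses, zfs, n_group=100, n_bins=10):
    """Label each halo with its z_f bin (0 = lowest) within its group of similar mass."""
    order = sorted(range(len(masses)), key=lambda i: masses[i])
    labels = [None] * len(masses)
    for start in range(0, len(order), n_group):
        # Halos adjacent in the mass-sorted list form one narrow mass group.
        group = sorted(order[start:start + n_group], key=lambda i: zfs[i])
        for rank, i in enumerate(group):
            labels[i] = rank * n_bins // len(group)
    return labels

# Mock catalogue: z_f decreases mildly with mass, with large scatter (assumed numbers).
rng = random.Random(1)
log_mass = [rng.uniform(13.3, 14.3) for _ in range(1000)]
z_form = [1.0 - 0.3 * (lm - 13.3) + rng.gauss(0.0, 0.3) for lm in log_mass]
labels = zf_decile_labels(log_mass, z_form, n_group=100)

def bin_mean(values, b):
    """Mean of `values` over the halos assigned to z_f bin b."""
    picked = [v for v, lab in zip(values, labels) if lab == b]
    return sum(picked) / len(picked)

print(bin_mean(log_mass, 0), bin_mean(log_mass, 9))
print(bin_mean(z_form, 0), bin_mean(z_form, 9))
```

Cumulative samples such as $x = 0.3$ then follow by taking the union of the three lowest (or three highest) labels. Because every label draws the same number of halos from every mass group, the subsamples share the mass function of the parent sample by construction.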
Figure 2 illustrates the signal we quantify in this section. We plot the halo-halo 
power spectrum for halos with $M > 2 \times 10^{13}\,h^{-1} M_\odot$ at $z = 0$ (left panel) and $z = 0.8$ 
(right panel) for two subsamples of the parent sample with the 25% highest (blue) and 
lowest (green) formation redshifts $z_f$ for $f = 0.5$, selected as described in the previous 
paragraph. For reference we plot a random subsample of the halo population with 
the same number density as the $z_f$-dependent subsamples in red. This should have 
approximately the same noise properties as the other two samples, but reflects the 
clustering properties of the full halo population. Dashed lines are for $f_{\rm NL}^{\rm LSS} = -200$ 
and solid lines are for $f_{\rm NL}^{\rm LSS} = 200$. In the region of the spectrum where $\Delta b_{\rm NG}$ is 
important (restricted to $k < 0.03\,h$/Mpc in our analysis), the power spectrum is 
noisy but appears to depend on $z_f$. Because the spectra show good agreement at 
higher $k$ where $\Delta b_{\rm NG}$ is small, our scheme to match the first term in Equation (10) 
between samples is successful, and the difference between the spectra at small $k$ can 
be attributed to the second ($z_f$-dependent) term in Equation (10) rather than the term 
proportional to the Gaussian Lagrangian bias $b_L^G$. 

We assume the following relation between halo ($\delta_h(\mathbf{k})$) and dark matter ($\delta_m(\mathbf{k})$) 
individual Eulerian density modes observed at redshift $z_o$, in order to fit for the 
amplitude of the non-Gaussian halo bias, $A_{\rm NG}$: 

$\delta_h(\mathbf{k}) = \left[b_G + A_{\rm NG}\frac{2 f_{\rm NL}}{D(z_o) M(k)}\right]\delta_m(\mathbf{k}) + n(\mathbf{k})$. (17) 

Here $b_G = 1 + b_L^G$ represents the Eulerian scale-independent contribution to the halo 
bias (first term in Equation (7)). $A_{\rm NG}$ describes the amplitude of the non-Gaussian halo 
bias, and Equation (10) predicts that $A_{\rm NG}$ is the logarithmic derivative of the Gaussian 
mass function of the halo sample with respect to $\sigma_8$, i.e., $A_{\rm NG}$ should be given by 
$d \ln n(M, z_o)/d \ln \sigma_8 + \partial \ln P_{z_f}(fM, z_f|M, z_o)/\partial \ln \sigma_8$. We assume the noise $n(\mathbf{k})$ 
to be Poissonian. For the results presented in this section, we first use the $f_{\rm NL} = 0$ 
simulation to determine the scale-independent halo bias $b_G$ for each mass, redshift, 
and $z_f$-dependent halo subsample while holding $A_{\rm NG} = 0$. We then assume the $f_{\rm NL}$ 
and $k$ dependence in Equation (17) and fit the four simulations with non-zero $f_{\rm NL}$ for a 
single number, $A_{\rm NG}$, for each halo subsample. This accounts for any small dependence 
of $b_G$ on $z_f$ in the Gaussian case. In the Appendix we present further details of this 
fitting procedure and compare this approach with a more conservative one, where 
separate values of $b_G$ are fit simultaneously for each $f_{\rm NL}$ when fitting for $A_{\rm NG}$. 
An advantage over previous approaches is that our fit is performed mode by mode for 
the 366 available modes, rather than to a power spectrum where an effective value of 
$k$ must be chosen for each bin; this choice seemed to impact the results of Ref. [14]. 
Note that in the model given by Equation (17) $\sigma_{A_{\rm NG}}^2 \propto 1/N_{\rm halos}$, so our measurement 
error is smallest at $z = 0$ where there are the most halos above our fixed mass limit 
available from our simulations. 
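The two-step, mode-by-mode fit can be illustrated with an unweighted least-squares sketch. It assumes a linear model of the form $\delta_h = [b_G + A_{\rm NG}\, f_{\rm NL}\, u(k)]\,\delta_m + n$ with $u(k) = 2/[D(z_o) M(k)]$ supplied by the caller, real-valued mode amplitudes (real and imaginary parts entered separately), and equal noise variance for every mode; it is not the likelihood described in the Appendix, and the function names are hypothetical.

```python
def fit_gaussian_bias(delta_h, delta_m):
    """Least-squares b_G from the f_NL = 0 run, with A_NG held at zero."""
    num = sum(h * m for h, m in zip(delta_h, delta_m))
    den = sum(m * m for m in delta_m)
    return num / den

def fit_a_ng(delta_h, delta_m, u, f_nl, b_g):
    """Least-squares A_NG for modes pooled from runs with non-zero f_NL.

    u[i] is the scale-dependent factor 2 / (D M(k_i)); f_nl[i] is the value of
    f_NL in the run that mode i was taken from; b_g is held fixed.
    """
    num = 0.0
    den = 0.0
    for h, m, uk, fnl in zip(delta_h, delta_m, u, f_nl):
        w = fnl * uk * m           # derivative of the model with respect to A_NG
        num += w * (h - b_g * m)   # residual after removing the Gaussian term
        den += w * w
    return num / den

# Noise-free synthetic modes with known b_G = 1.7 and A_NG = 1.1 (made-up inputs).
dm = [0.02, -0.015, 0.03, -0.01]
uk = [2.0e-3, 1.5e-3, 1.0e-3, 5.0e-4]
fnl = [75.0, -75.0, 151.0, -151.0]
dh_gauss = [1.7 * m for m in dm]
dh_ng = [(1.7 + 1.1 * f * u) * m for m, u, f in zip(dm, uk, fnl)]

b_fit = fit_gaussian_bias(dh_gauss, dm)
a_fit = fit_a_ng(dh_ng, dm, uk, fnl, b_fit)
print(b_fit, a_fit)
```

Pooling runs with opposite signs of $f_{\rm NL}$ that share the same initial phases makes the estimate of $A_{\rm NG}$ less sensitive to a small error in the fixed value of $b_G$.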

Equation (16) predicts that, when written as a function of the variable $x$, the $z_f$-dependent 
fractional contribution to $\Delta b_{\rm NG}$ is independent of both halo mass $M$ and 
observed redshift $z_o$; we will check these predictions explicitly later in this section. We 
begin with a mass-limited halo sample $M > 2 \times 10^{13}\,h^{-1} M_\odot$ at each snapshot redshift 
$z_o$ available from our simulations and measure the amplitude of its non-Gaussian bias 
term, $A_{\rm NG}^{\rm all}(z_o)$. To quantify the dependence of $A_{\rm NG}$ on $z_f$, we measure $\Delta A_{\rm NG}^{h,l}(x, z_o)$: 

$\Delta A_{\rm NG}^{h,l}(x, z_o) = A_{\rm NG}^{h,l}(x, z_o) - A_{\rm NG}^{\rm all}(z_o)$, (18) 

where $x$ is the fraction of halos with the highest (lowest) formation redshifts entering 
the $h$ ($l$) subsamples at each $z_o$. In Figure 3 we plot $\Delta A_{\rm NG}^{h,l}(x)$ after performing an 
error-weighted average over all $z_o$ values between $z = 0$ and $z = 2.23$ for two definitions 
of the halo formation redshift, $f = 0.5$ and $f = 0.75$. Note that $\Delta A_{\rm NG}^{h}(x)$ is positive, 
while $\Delta A_{\rm NG}^{l}(x)$ is negative for both values of $f$. Conservatively, we show the errors 
from the $z_o = 0$ sample only, since the halo samples at different redshifts will be 
correlated. Moreover, note that for two values of $x$, $x_1 < x_2$, the first subsample is 
contained in the second. Therefore, the error bars at different $x$ are highly correlated 
as well. The agreement with the ePS prediction given by Equation (16) is excellent for 
both the low $z_f$ (green curve) and high $z_f$ (red curve) subsamples. Note that the ePS 
prediction is: 

$\Delta A_{\rm NG} = \frac{\partial \ln P_{z_f}}{\partial \ln \sigma_8} = -1 - \frac{\omega_f}{P_{\omega_f}}\frac{dP_{\omega_f}}{d\omega_f}$. (19) 

For the finite binning used in the simulations, the plotted theory line uses Equation 
(16). 

Figure 3. $\Delta A_{\rm NG}^{h(l)}(x) = A_{\rm NG}^{h(l)}(x) - A_{\rm NG}^{\rm all}$ averaged over simulation snapshots 
between $z = 0$ and $z = 2.23$ for subsamples of $M > 2 \times 10^{13}\,h^{-1} M_\odot$ halos 
of the highest (lowest) fraction $x$ of halo formation redshifts (black points with 
errors). The left panel uses $f = 0.5$ to define the formation redshift $z_f$, while 
the right panel uses $f = 0.75$. The ePS prediction (Equation (16)) for the high $z_f$ 
(red) and low $z_f$ (green) subsamples provides an excellent fit to the simulation 
measurements. 

The essential features of the non-Gaussian halo assembly bias we quantify here 
are: 

• The left panel of Figure 4 shows how the relative importance of the first and second 
term in Equation (10) evolves with redshift, for our mass-limited halo sample; the 
amplitude of the $z_f$-dependent contribution is potentially large compared to the 
first term in Equation (10), $d \ln n(M, z_o)/d \ln \sigma_8 \approx 1.68\,(b_G - 1)$. For instance, 
if we split the $z = 0$, $M > 2 \times 10^{13}\,h^{-1} M_\odot$ halo sample in two as a function 
of $z_f$ (i.e., $x = 0.5$), the $A_{\rm NG}$ predictions differ from the mean ($A_{\rm NG}^{\rm all} = 1.12$) 
by $\pm(1.28 \pm 0.2)$. That is, the more recently formed halos have a factor of 
$\sim 7$ smaller expected signal than the full halo sample (and with opposite sign: 
$A_{\rm NG}^{l}/A_{\rm NG}^{\rm all} = -0.16/1.12$), while the older halos have a factor of $\sim 2$ larger signal 
than the full halo sample ($A_{\rm NG}^{h}/A_{\rm NG}^{\rm all} \sim 2.3/1.12$). 

• The effect is asymmetric between old and young halos. Even if the tracer 
population excludes only the 10% oldest halos, the value of $A_{\rm NG}$ for the remaining 
90% of the halos differs from the full sample by $\approx 0.44$; at $z = 0$, this amounts 
to a change of 0.44/1.12 = 40%. 

• Figures 4 and 5 support the ePS prediction expressed in Equation (16) that there 
is no mass or redshift dependence to the $z_f$-dependent term, when expressed in 
terms of the variable $x$. However, note that we are restricted to studying massive 
halos $M > 2 \times 10^{13}\,h^{-1} M_\odot$, and that our errors on $\Delta A_{\rm NG}$ rapidly increase with 
$z$. 

Figure 4. Left panel: Best fit $A_{\rm NG}$ for the full mass-limited sample $M > 
2 \times 10^{13}\,h^{-1} M_\odot$ (points with errors) as a function of redshift, as well as for 
subsamples split in half on $z_f$ ($x = 0.5$) for $f = 0.5$ (dotted) and $f = 0.75$ 
(dashed). The $z_f$-dependent correction is well described as an additive correction 
as in Equation (10). Right panel: $|A_{\rm NG}^{h,l}(x = 0.5) - A_{\rm NG}^{\rm all}|$ as a function of redshift for 
$f = 0.5$ (diamonds) and $f = 0.75$ (squares). Points are offset from the snapshot 
redshift by $\pm 0.02$ for clarity, and the ePS predictions are shown as straight lines 
for $f = 0.5$ (dotted) and $f = 0.75$ (dashed). There is no clear redshift dependence, 
though the errors are large so our measurement is not very constraining. 

Finally, we generate the closest possible sample to recent major mergers available from 
our simulations at $z = 0$, where the errors on $\Delta A_{\rm NG}$ are smallest. We sort the halo 
sample not on $z_f$ but on the first progenitor's mass at the previous snapshot output, 
$z = 0.137$ (1.7 Gyr earlier). For $x < 0.4$ (i.e., for the 40% of halos with the lowest 
progenitor masses) we find a $\Delta A_{\rm NG}(x)$ consistent with $-1$, the ePS prediction derived 
in Slosar et al. (2008) [7], and the limit of our ePS predictions for large values of $f$. 

4. Implications for upcoming galaxy surveys 

Non-Gaussianity constraints achievable from forthcoming and planned surveys using 
halo bias are very promising [11, 33]; these forecasts are obtained using the mean 
non-Gaussian halo bias relation and therefore assume that these surveys will select 
galaxies that provide a fair sample of the underlying dark matter halo population in 
the suitable mass range. However, if the survey selection were to preferentially select 
galaxies occupying dark matter halos with $z_f$ lower than the mean or were to miss, 
e.g., the 10% of dark matter halos with the highest $z_f$, this, if unaccounted for, would 
introduce a significant bias in the measured $f_{\rm NL}$ parameter. As shown in Section 3, in 
principle, for some extreme cases, this bias could be as large as 40-100% of the (mean) 
signal. 

Figure 5. The average profile measured using all halos with $M > 2 \times 
10^{13}\,h^{-1} M_\odot$ (black points with error bars, same as in Figure 3) plotted with 
three subsamples in mass containing equal numbers of halos. The blue curve 
uses halos with $M \in [2.0, 2.82] \times 10^{13}\,h^{-1} M_\odot$, the green curve uses halos 
with $M \in [2.82, 4.88] \times 10^{13}\,h^{-1} M_\odot$, and the red curve uses halos with 
$M > 4.88 \times 10^{13}\,h^{-1} M_\odot$. For the mass range probed here, $\Delta A_{\rm NG}$ is independent 
of mass. 

As alarming as this may seem, it is important to bear in mind that the results 
of the previous section only affect the predictions of $\Delta b_{\rm NG}$ for galaxy redshift surveys 
if the probability of a halo hosting a galaxy in the survey sample depends on 
the formation history of the halo, at fixed halo mass $M$. Within the halo-model 
framework [34], the standard implementation of the halo occupation distribution 
approach describes the bias of specific galaxy types with respect to the underlying 
dark matter by assuming that the probability for a halo to host a galaxy depends only 
on the halo mass. While for many galaxy types the impact of secondary parameters 
appears small (e.g., [3511311133 references therein), there are some indications of 
halo-assembly bias and evidence for secondary parameters [351 IMl SO] ■ Semi-analytic 
models of galaxy formation are dependent on the entire dark-matter halo-merger 
history and can, in principle, include the full dependence of galaxy properties on 
secondary parameters. Ref. [41] assessed the impact of Gaussian halo assembly bias 
for Millennium simulation galaxy samples with two different magnitude cuts, and found 
changes to the Gaussian bias of < 10%. However, this change was not explained by 
the addition of any simple secondary parameter, like $z_{f=0.5}$ or halo concentration. 
Therefore, by considering $z_f$ in what follows, we may be underestimating the possible 
signal from other properties of halo formation that correlate more closely with galaxy 
properties. 

In this section we use galaxies selected from the Bertone et al. (2007) [42] 
semi-analytic galaxy-formation model implemented on the Millennium simulation halo 
merger trees [32]. In order to estimate the order of magnitude of this effect, we consider 
realistic but simply selected galaxy populations relevant to upcoming galaxy surveys, 
and ask whether these galaxies occupy dark matter halos with specific formation 
histories, in such a way as to introduce a marked non-Gaussian halo assembly bias 
contribution. Note that the same procedure can be followed to study the impact 
of halo assembly bias on galaxy clustering in Gaussian initial conditions; in fact, in 
both cases, we only need to quantify how different the formation redshift distribution 
of the halo population hosting the selected galaxies is from the full halo population, 
for a given halo mass $M$ (or small $dM$ around $M$). 

The procedure in the context of a mock galaxy sample embedded in an N-body 
simulation is as follows: 

• Select a galaxy sample (e.g., based on observational selection criteria) and identify 
their host dark matter halos (host halo sample). 

• Identify the host dark matter halos' mass range and select the full dark matter 
halo sample in that mass range (full halo sample). The halos containing the 
galaxy sample are a subset of this set. 

• Measure the formation-dependent quantity of interest, such as $z_f$, for the full halo 
sample. 

• Determine the dependence of the $z_f$ distribution on halo mass $M$; in practice we 
do this exactly as for the sample shown in Figure 1. That is, we determine the 
dividing lines between 10 bins in $z_f$ (black lines in Figure 1) as a function of halo 
mass. 

• Determine how the halos containing the galaxy sample occupy these 10 bins, 
summarized by $P_{\rm gal}(y_{\rm bin})$; here we use $y_{\rm bin}$ to denote the bin number or transversal 
band in Figure 1: $y_{\rm bin} = 1$ corresponds to halos with the 10% lowest $z_f$ for each mass 
bin and $y_{\rm bin} = 10$ corresponds to halos with the 10% highest $z_f$ for each mass 
bin. If galaxies were hosted in a random sample of the full halos, $P_{\rm gal}(y_{\rm bin})$ would 
be a constant. Any deviation from a constant indicates a correlation between 
the galaxy selection and the host halo formation history. We have chosen the 
normalization such that $\sum_{\rm bins} P_{\rm gal}(y_{\rm bin}) = 1$. 

• Determine the halo assembly bias correction factor. For our application this is 

    \Delta A_{\rm NG}^{\rm gal} = \int \Delta A_{\rm NG}(y)\, P_{\rm gal}(y)\, dy \approx \sum_{i=1}^{10} \Delta A_{\rm NG}(y_i)\, P_{\rm gal}(y_i) .    (20)

To evaluate ΔA_NG^gal, in practice we break the halos into 10 bins as shown in 
Figure 1 and sum the expected signal, ΔA_NG(y_i), weighted by the fraction of galaxies in 
these discrete bins, P_gal(y_i). We compute ΔA_NG(y_i) using Equation (16). For 
this application we are considering disjoint halo subsamples rather than the 
cumulative bins considered in Figure 3. In the right panel of Figure 6 we show 
the theory curves for 10 disjoint bins, where we have used f = 0.5 (f = 0.75) to 
define z_f in the solid (dashed) curve. 
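The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the use of equal-count mass bins, and the input arrays are hypothetical choices, not the pipeline used for the results in this paper. The array `delta_a_ng` stands for the ten theory values ΔA_NG(y_i), which must be supplied separately.

```python
import numpy as np

def assembly_bias_correction(mass_full, zf_full, mass_host, zf_host,
                             delta_a_ng, n_mass_bins=5):
    """Return (P_gal, Delta A_NG^gal) following the discrete sum in Eq. (20).

    mass_full, zf_full : mass and formation redshift of the full halo sample
    mass_host, zf_host : same quantities for halos hosting the selected galaxies
    delta_a_ng         : theory values Delta A_NG(y_i), one per z_f bin (assumed given)
    n_mass_bins        : number of narrow mass bins (a hypothetical choice)
    """
    delta_a_ng = np.asarray(delta_a_ng, dtype=float)
    zf_full = np.asarray(zf_full, dtype=float)
    zf_host = np.asarray(zf_host, dtype=float)
    n_zf_bins = delta_a_ng.size
    log_m_full = np.log10(mass_full)
    log_m_host = np.log10(mass_host)

    # Narrow mass bins holding equal numbers of full-sample halos.
    edges = np.quantile(log_m_full, np.linspace(0.0, 1.0, n_mass_bins + 1))
    full_bin = np.clip(np.searchsorted(edges, log_m_full, side="right") - 1,
                       0, n_mass_bins - 1)
    host_bin = np.clip(np.searchsorted(edges, log_m_host, side="right") - 1,
                       0, n_mass_bins - 1)

    counts = np.zeros(n_zf_bins)
    for m in range(n_mass_bins):
        # Dividing lines between the z_f bins of the full sample at this mass.
        lines = np.quantile(zf_full[full_bin == m],
                            np.linspace(0.0, 1.0, n_zf_bins + 1)[1:-1])
        # Index 0 is the lowest-z_f bin (y_bin = 1 in the text).
        y_bin = np.searchsorted(lines, zf_host[host_bin == m], side="right")
        counts += np.bincount(y_bin, minlength=n_zf_bins)

    p_gal = counts / counts.sum()          # normalized so that sum(P_gal) = 1
    return p_gal, float(np.sum(delta_a_ng * p_gal))
```

If the host halos are a random subset of the full sample, `p_gal` is close to 0.1 in every bin and the correction reduces to the mean of `delta_a_ng`.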
Non-Gaussian assembly bias in the Bertone et al. (2007) mock galaxy catalogs 

n (h^-1 Mpc)^-3   selection criteria          ΔA_NG^gal   b_G   δ_c(b_G − 1) ≈ A_NG   % change 
4.5 × 10^-4       > 8 × 10^10 h^-1 M_⊙        0.51        2.3   2.2                   23% 
4.5 × 10^-4       > 24 M_⊙/yr                 0.24        1.3   0.51                  48% 

Table 1. We select two galaxy samples from the semi-analytic model of Bertone 
et al. (2007) [42], which has been run on the Millennium simulation [32]. Both 
samples are selected to have the same number density, roughly the value that 
would be targeted for upcoming BAO surveys. The first sample is selected based 
on stellar mass, and the second based on star formation rate. Both galaxy samples 
preferentially occupy halos that formed earlier than average (see Figure 6), which 
translates into non-zero values of ΔA_NG^gal. We measure the Gaussian Eulerian bias 
b_G from the galaxy clustering amplitude in the Millennium simulation for both samples in 
order to infer the expected A_NG for each sample in the absence of non-Gaussian 
halo assembly bias. 



If for some galaxy sample ΔA_NG^gal is not negligible compared to A_NG ~ 1.68(b_G − 1), 
then the effect of non-Gaussian assembly bias cannot be ignored. In particular recall 
that for recently formed halos, ΔA_NG can even cancel out A_NG, erasing any non- 
Gaussian signature, if present. 

BigBOSS [13] plans to select luminous red galaxies out to z ~ 1 and emission-line 
galaxies at higher redshift; other proposed surveys (e.g., Euclid [Hj) will also target 
emission-line galaxies out to z ~ 2. Emission-line galaxies are thought to have high 
star formation rates, possibly triggered by mergers. Should galaxy mergers trace the 
host dark matter halo mergers, this selection effect could greatly reduce the expected 
signal for a given value of f_NL. Another large future survey suitable for this study is 
LSST; LSST will select all galaxies above a given magnitude cut, and thus its selection 
criterion should be less correlated with the host halos' accretion history than those of the other 
two surveys. 

As an example, we consider two distinct galaxy samples from the semi-analytic 
catalogs of Ref. [42] at z = 0.99, one selected on large stellar mass (M_stellar > 8 × 
10^10 h^-1 M_⊙) and the other on large star formation rate (SFR > 24 M_⊙/yr), where 
both samples are chosen to have number densities of 4.5 × 10^-4 (Mpc/h)^-3, i.e., in 
the right ballpark for a BAO-focused survey. P_gal(y) for these samples is shown by 
the solid green and dashed red lines in Figure 6 for the stellar mass and star formation 
rate selected samples, respectively; the figure is normalized such that P_gal(y) = 0.1 
corresponds to a uniform sampling of the underlying halo distribution. For both 
samples, galaxies occupy halos across the distribution of z_f, but with a preference for 
halos that formed early (high y_bin); the trend is stronger for the stellar mass selected sample. 
Using Eq. (20), this preference translates to ΔA_NG^gal = 0.51 for the stellar mass selected 
sample, while their Gaussian bias is ~ 2.3, as measured from their power spectrum on 
large scales. For the star formation rate-selected sample, ΔA_NG^gal = 0.24, while their 
bias is ~ 1.3. Since A_NG ≈ δ_c(b_G − 1), accounting for the non-Gaussian assembly 
bias amounts to a boost of the expected scale-dependent halo bias of 23% and 48%, 
respectively. Results for these galaxy samples are summarized in Table 1. 
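The quoted percentages follow from simple arithmetic, which the short sketch below reproduces from the rounded values given in the text. The function name is hypothetical, and δ_c = 1.68 is taken from the approximation A_NG ~ 1.68(b_G − 1) used above. Because the inputs are rounded, the outputs agree with the quoted 23% and 48% only to within about one percentage point.

```python
DELTA_C = 1.68  # collapse threshold, as in A_NG ~ 1.68 (b_G - 1)

def fractional_boost(delta_a_gal, b_gaussian, delta_c=DELTA_C):
    """Ratio of the assembly bias correction to the baseline A_NG = delta_c (b_G - 1)."""
    a_ng = delta_c * (b_gaussian - 1.0)
    return delta_a_gal / a_ng

# Rounded inputs from the text: (Delta A_NG^gal, b_G) for each sample.
stellar_mass_boost = fractional_boost(0.51, 2.3)
sfr_boost = fractional_boost(0.24, 1.3)

print(f"stellar mass selected: {100 * stellar_mass_boost:.0f}%")
print(f"star formation rate selected: {100 * sfr_boost:.0f}%")
```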




Figure 6. The solid (dashed) black curves in the right panel show ΔA_NG(y) 
for 10 bins in z_f for f = 0.5 (f = 0.75) using Equation (16). Each bin contains 
10% of halos, in contrast to the theory curves in Figure 3 where the "high" 
and "low" curves considered cumulative samples (i.e., 30% lowest and highest). 
The solid green curve in the left panel shows P_gal(y) for the stellar mass selected 
sample described in the text (where z_f has been defined using f = 0.5), while 
the dashed red curve shows P_gal(y) for the star formation rate selected galaxies 
(where z_f has been defined using f = 0.75). The normalization is chosen such 
that P_gal(y) = 0.1 represents a sampling of the underlying halo distribution which 
is independent of halo formation redshift z_f. The integral over the product of the 
solid (dashed) curves gives the assembly bias contribution to the non-Gaussian 
halo bias (Equation 20) for the stellar mass (star formation rate) selected galaxy 
samples. The result is ΔA_NG^gal = 0.51 for the stellar mass selected sample, and 
ΔA_NG^gal = 0.24 for the star formation rate selected sample. See Table 1 for more 
details. 



5. Conclusions 

We have demonstrated that the impact of assembly bias on the amplitude of the non- 
Gaussian halo bias can be quite strong. We have expanded the arguments in Slosar et 
al. (2008) fT" using extended Press-Schechter theory to express non-Gaussian assembly 
bias in terms of halo formation redshift z_f for arbitrary f ≥ 0.5, where z_f is the redshift 
at which a halo has accreted a fraction f of its final mass. This theory predicts that 
halo subsamples containing a fraction x of the earliest (latest) forming halos (compared 
with other halos with the same mass) have a non-Gaussian halo bias that differs from 
the full parent halo sample by a fractional correction dependent only on x; when 
using this variable, the non-Gaussian assembly bias correction is independent of halo 
mass and redshift. The N-body simulations of Grossi et al. (2009) [13] are in good 
agreement with these ePS predictions. 

The implications of these results for galaxy redshift surveys are extremely 
uncertain. If the commonly adopted assumption holds that the probability of a halo hosting 
a particular type of galaxy depends only on the halo mass, then there will be no 
non-Gaussian halo assembly bias contribution to the galaxy sample's non-Gaussian 
bias. However, in principle, galaxy formation depends on the entire history of host 
dark matter halos to some degree, and semi-analytic models of galaxy formation 
attempt to account for this dependence. In Section 4 we found that a relatively mild 
preference for early-forming halos for both stellar mass and star formation rate selected 
z = 1 samples translates into an increase in the expected non-Gaussian galaxy bias of 
~ 23-48% compared with the average signal expected from the samples' Gaussian 
bias values. This result is particularly counter-intuitive for star-forming galaxies, since 
star formation is triggered by galaxy mergers in these models. One should bear in mind 
that a galaxy merger does not necessarily correspond to a major merger of the host 
dark matter halo, and that it is reasonable to expect some time-lag between the dark 
matter halo merger and the merger of the galaxies populating them. Furthermore, we 
caution that we have not been extensive in our exploration of galaxy sample selection 
space, or precise enough to make predictions for upcoming experiments. There may 
be certain populations for which this effect may be much larger or much smaller. On 
the other hand, it is possible that in an analysis aimed at constraining f_NL, one may 
be able to weight galaxies by some color or spectral property in order to enhance the 
non-Gaussian signal in the survey. More work is needed to further quantify the impact 
of such an approach on the recovered constraints on f_NL from realistic surveys. 
Appendix A. Fitting non-Gaussian bias 

We wish to construct a χ² to estimate the amplitude of halo assembly bias in non- 
Gaussian simulations; in principle this would require knowledge of a 4-point function 
in the non-Gaussian theory. We begin by considering the expected errors in linear theory 
for Gaussian initial conditions, and linearly biased tracers that Poisson-sample the 
continuous matter density field. Our model for the relation between the halo density 
field and the matter density field for each mode k is given by Equation (17), and 
Poisson sampling implies ⟨n(k) n*(k)⟩ = n̄^-1. We will keep the k-dependence of the 
non-Gaussian component of the bias fixed, M^-1(k), and fit for an amplitude of the 
non-Gaussian component, A_NG, and scale-independent component, b_G. Under these 
assumptions, the variance about the model is 

KS.n- (bo + -^NO n^^w'^^^V -'A.')'') = (>i(k)n'(k)),S,*„<„. (AJ) 



'D{zo)M{k) 

Here the average is over the Poisson noise, and the matter field is considered fixed. For 
our calculation, we will only consider the real component of δ_h* δ_m. Since the noise 
is assumed uncorrelated with δ_m, the noise associated with only the real component is 
smaller by a factor of 2 than in Equation (A.1), which counts both real and imaginary 
components. 

Vynm{k)/2n 

Because there may be a slight dependence of b_G on f_NL, we consider two distinct 
fitting procedures for A_NG. In both, we assume the k and f_NL dependence in Equation 
(10) and define one and two sigma errors by changes in the value of χ² of 1 and 4 
in Equation (A.2), where the sum is performed over the density modes of interest 
in each different f_NL simulation. Following Ref. [13], we limit all fits to modes 
with k < 0.03 h/Mpc, which after discarding modes with wavenumber 2π/L_box and 
√2 × 2π/L_box, amounts to 366 modes. We find that for the massive halos considered 
here, χ² can be substantially smaller than the number of degrees of freedom (see 
[45, 46] for other recent evidence for sub-Poissonian sampling). Therefore, our error 
bars may be overestimated. 
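The Δχ² = 1 and Δχ² = 4 error definitions can be illustrated with a generic one-parameter fit. The sketch below is not the χ² of Equation (A.2): it assumes independent Gaussian errors `sigma` on each mode, a fixed scale-independent term `b_fixed`, and a fixed `template` standing in for the known k and f_NL dependence, all of which are hypothetical names. For a model linear in the amplitude, χ² is exactly quadratic, so the intervals follow from its curvature.

```python
import numpy as np

def chi2(amplitude, y, template, sigma, b_fixed):
    """chi^2 of the model y = b_fixed + amplitude * template."""
    resid = (y - b_fixed - amplitude * template) / sigma
    return float(np.sum(resid ** 2))

def fit_amplitude(y, template, sigma, b_fixed):
    """Best-fit amplitude with Delta chi^2 = 1 and Delta chi^2 = 4 half-widths."""
    y = np.asarray(y, dtype=float)
    template = np.asarray(template, dtype=float)
    weights = 1.0 / np.broadcast_to(np.asarray(sigma, dtype=float), y.shape) ** 2

    # chi^2(A) = chi^2_min + curvature * (A - A_best)^2 for a linear model.
    curvature = np.sum(weights * template ** 2)
    a_best = np.sum(weights * template * (y - b_fixed)) / curvature

    one_sigma = 1.0 / np.sqrt(curvature)   # Delta chi^2 = 1
    two_sigma = 2.0 / np.sqrt(curvature)   # Delta chi^2 = 4
    return a_best, one_sigma, two_sigma
```

Evaluating `chi2` at `a_best + one_sigma` and at `a_best` gives a difference of 1, which is a quick consistency check on any implementation of this error definition.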

In the first approach, applied in the main text, we first fit the model with A_NG = 0 
to the f_NL = 0 simulation to determine b_G and its uncertainty for each mass, redshift, 
and z_f-dependent halo subsample. Next we fit for A_NG using the four f_NL ≠ 0 
simulations, assuming that the scale-independent contribution, b_G, is independent of 
f_NL. Therefore we use the measurement in the f_NL = 0 simulation to marginalize over 
a b_G common to all values of f_NL, integrating over the probability distribution P(b_G) 
derived from the f_NL = 0 result. 

In the second, more conservative approach, we fit our simulation data to a five 
parameter model: the scale-independent b_G in each non-zero f_NL simulation, and a 
single amplitude A_NG for the non-Gaussian signal. Because of the limited range of 
k values over which to fit, the non-Gaussian amplitude A_NG can be degenerate with 
the scale-independent bias b_G in Equation (17). Figure A1 shows an example of this. 
We first fit the f_NL = 0 simulation for b_{G,f_NL=0} in the same k range we use to fit 
the non-Gaussian bias (k < 0.03 h/Mpc), holding A_NG = 0. To illustrate the non- 
Gaussian signal, we plot the halo-matter cross power spectrum P_hm normalized by 
b_{G,f_NL=0} P_mm for f_NL = 200, 100, −100, −200 (blue, green, red, light blue). We plot 
the best fit five parameter model in black for k < 0.03 h/Mpc, and we plot the value of 
b_G for each f_NL value at k > 0.03 h/Mpc. The best fit values of b_G are anti-correlated 
with the value of f_NL in the z = 0.8 sample, which could in principle affect the best 
fit value for A_NG. We compare the results of the five and one parameter fits to ΔA_NG 
in Figure A2. The two approaches give consistent results, though the errors are much 
larger, as expected, when b_G is fit separately for each value of f_NL. 

Figure A1. The colored dashed curves show P_hm/(b_{G,f_NL=0} P_mm) − 1 for 
f_NL = 200, 100, −100, −200 (top to bottom; blue, green, red, light blue) for 
the high z_f subsample with M > 2 × 10^13 h^-1 M_⊙ at z = 0 (left panel) and 
z = 0.8 (right panel). The colored solid lines show the 5 parameter fit to the 
modes below k = 0.03 h Mpc^-1 for each value of f_NL. For k > 0.03 h Mpc^-1 
we show the best fit values of b_G for each f_NL simulation. In the right panel, 
b_G appears anti-correlated with f_NL. 

In Section [3] we introduce the parameter N_group, which determines how many 
halos are grouped together before dividing into subsamples based on z_f. If N_group 
is too small, one expects to introduce sample variance, in that halos are scattered 
across boundaries because each N_group set of halos is a finite sampling of the true z_f 
distribution. However, if N_group is too large, then one will introduce a spurious z_f 
dependence through the dependence of P(z_f) on halo mass. In Figure A3 we compare 
ΔA_NG values for N_group = 100 and N_group = 10000, and demonstrate that our results 
are insensitive to this choice. 
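The role of N_group can be made concrete with a minimal sketch of one way to build z_f-dependent subsamples with matched mass functions. This is an assumed implementation, not necessarily the one used in Section 3: halos are sorted by mass, cut into consecutive groups of `n_group` halos, and within each group the fraction `frac` with the lowest and highest z_f is flagged. The function and argument names are hypothetical; `frac=0.3` mirrors the 30% lowest and highest samples mentioned in the text.

```python
import numpy as np

def split_by_formation_redshift(mass, zf, n_group=100, frac=0.3):
    """Flag the low-z_f and high-z_f halos within mass-ordered groups.

    Returns two boolean masks (low, high). Each group of n_group halos that are
    adjacent in mass contributes the same number of halos to both subsamples,
    so the two subsamples have nearly identical mass functions.
    """
    mass = np.asarray(mass, dtype=float)
    zf = np.asarray(zf, dtype=float)
    order = np.argsort(mass)
    low = np.zeros(mass.size, dtype=bool)
    high = np.zeros(mass.size, dtype=bool)

    for start in range(0, mass.size, n_group):
        idx = order[start:start + n_group]      # halos adjacent in mass
        n_pick = int(round(frac * idx.size))
        ranked = idx[np.argsort(zf[idx])]       # same halos, ordered by z_f
        low[ranked[:n_pick]] = True
        high[ranked[idx.size - n_pick:]] = True
    return low, high
```

A small `n_group` makes each group a noisy sample of the z_f distribution, while a large `n_group` spans a wide mass range, so any mass dependence of z_f leaks into the split; these are the two regimes described above.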
Figure A2. Same as Figure 3 with f = 0.5 in the left panel, and f = 0.75 in 
the right panel. We compare fits to ΔA_NG with five parameters (green, larger 
errors) with one parameter (blue, smaller error bars and offset from x by 0.01 for 
clarity) as described in the text. The black curve shows the ePS prediction. 





Figure A3. ΔA_NG(x) measured from subsamples defined with N_group = 
100 (green) and N_group = 10000 (blue). The results are insensitive to this 
parameter, which enters how we select z_f-dependent halo subsamples with identical 
mass functions. See the text for details. 



